On the Relation Between Low Density Separation, Spectral Clustering and Graph Cuts

Authors

  • Hariharan Narayanan
  • Mikhail Belkin
  • Partha Niyogi
Abstract

One of the intuitions underlying many graph-based methods for clustering and semi-supervised learning is that class or cluster boundaries pass through areas of low probability density. In this paper we provide some formal analysis of that notion for a probability distribution. We introduce a notion of weighted boundary volume, which measures the length of the class/cluster boundary weighted by the density of the underlying probability distribution. We show that the sizes of the cuts of certain commonly used data adjacency graphs converge to this continuous weighted volume of the boundary.

Keywords: Clustering, Semi-Supervised Learning
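The low-density intuition in the abstract can be illustrated numerically. The following is a minimal sketch, not the paper's construction: it assumes Gaussian edge weights of the form exp(-||x - y||^2 / (4t)), omits any normalization factors, and uses a hypothetical two-blob sample. The helper name `gaussian_cut` and all parameter values are illustrative choices. It compares the cut across a boundary lying between the blobs with the cut across a boundary passing through one blob.

```python
import numpy as np

def gaussian_cut(points, labels, t):
    """Sum of Gaussian edge weights between points on opposite sides of a boundary.

    `labels` is a boolean array marking which side each point falls on.
    The kernel exp(-||x - y||^2 / (4t)) is an assumed weighting, unnormalized.
    """
    a = points[labels]
    b = points[~labels]
    # Pairwise squared distances between the two sides only.
    sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq / (4.0 * t)).sum()

rng = np.random.default_rng(0)
n = 400
# Hypothetical two-blob sample: density is low near x = 0, high near x = -2 and x = 2.
centers = np.where(rng.random(n) < 0.5, -2.0, 2.0)
points = np.column_stack([centers + 0.5 * rng.standard_normal(n),
                          0.5 * rng.standard_normal(n)])

t = 0.05
cut_low_density = gaussian_cut(points, points[:, 0] < 0.0, t)   # boundary between the blobs
cut_high_density = gaussian_cut(points, points[:, 0] < 2.0, t)  # boundary through a blob
print(cut_low_density, cut_high_density)
```

On a sample like this, the boundary through the sparse region should give a much smaller cut than the one through the dense region, which is the qualitative behavior the weighted boundary volume is meant to capture.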


Similar resources

Appendix to: On the Relation Between Low Density Separation, Spectral Clustering and Graph Cuts

A. Regularity conditions on p and S. We make the following assumptions about p: 1. p can be extended to a function p′ that is L-Lipschitz and bounded above by p_max. 2. For 0 < t < t_0, min(p(x), ∫ K_t(x, y) p(y) dy) ≥ p_min. Note that this is a property of both the boundary ∂M and p. We note that since p′ is L-Lipschitz over R^d, so is ∫_M K_t(x, z) p′(z) dz. We assume that S has condition n...


Beyond Spectral Clustering - Tight Relaxations of Balanced Graph Cuts

Spectral clustering is based on the spectral relaxation of the normalized/ratio graph cut criterion. While the spectral relaxation is known to be loose, it has been shown recently that a non-linear eigenproblem yields a tight relaxation of the Cheeger cut. In this paper, we extend this result considerably by providing a characterization of all balanced graph cuts which allow for a tight relaxat...


The f-Adjusted Graph Laplacian: a Diagonal Modification with a Geometric Interpretation

Consider a neighborhood graph, for example a k-nearest neighbor graph, that is constructed on sample points drawn according to some density p. Our goal is to re-weight the graph’s edges such that all cuts and volumes behave as if the graph were built on a different sample drawn from an alternative density p̃. We introduce the f-adjusted graph and prove that it provides the correct cuts and volum...


A Feature Space View of Spectral Clustering

The transductive SVM is a semi-supervised learning algorithm that searches for a large margin hyperplane in feature space. By withholding the training labels and adding a constraint that favors balanced clusters, it can be turned into a clustering algorithm. The Normalized Cuts clustering algorithm of Shi and Malik, although originally presented as spectral relaxation of a graph cut problem, ca...


Detecting Overlapping Communities in Social Networks using Deep Learning

In network analysis, a community is typically thought of as a group of nodes with a high density of edges among themselves and a low density of edges to other parts of the network. Detecting community structure is important in any network analysis task, especially for revealing patterns between specified nodes. There is a variety of approaches presented in the literature for overlapping...




Publication date: 2006